SECTION 1


INTRODUCTION 


1.1  BACKGROUND 

The  five  Cheyenne  Mountain  Upgrade  (CMU)  acquisitions  were  each  specified 
independently  with  no  end-to-end  performance  requirements.  In  May  1989,  the  CMU 
acquisitions  were  combined  under  a  single  Program  Management  Directive  (PMD)  for  CMU 
programs.  This  action  defined  the  CMU  as  a  subsystem  of  the  overall  Integrated  TW/AA 
system. In the summer of 1989, five key CMU-level performance parameters were derived
for inclusion in the CMU Acquisition Program Baseline (APB), Decision Coordination Paper
(DCP) and Test and Evaluation Master Plan (TEMP). This action was accomplished under
ESD/AFSPACECOM coordination. The coordinated parameters were then adopted by
AFSPACECOM for inclusion in the CMU System Operational Requirements Document
(SORD).

Estimates for the five key CMU-level performance parameter values are updated
quarterly. One of the five key parameters is information delivery time. Questions relating to
the methodology for estimating both the subsystem transit times and the overall information
delivery times were raised, and a request to review the methodology was made. This paper
documents the updated information delivery time methodology.


1.2  OBJECTIVES 

In  the  context  of  updating  the  CMU  performance  parameter  methodology,  the 
representation  of  each  subsystem's  message  transit  time  and  the  derivation  of  the  overall 
information delivery time from these representations are investigated. The goals of this study
are  to  develop  a  methodology  for  deriving  the  overall  CMU  information  delivery  time 
specification  from  the  transit  time  specifications  of  each  of  the  subsystems,  to  reevaluate  the 
assumption  of  the  normal  distribution  for  each  of  the  subsystem  transit  times,  to  identify 
alternative  distributions  for  the  subsystem  transit  times,  and  to  assess  other  related 
methodological  considerations. 


1.3  SCOPE 

Transit time, depicted in figure 1, is defined as the time a message takes to transit a
CMU subsystem. Information delivery time is the total time from message input to the CMU
to corresponding CMU message output to the Forward User display systems. The suite of
subsystems that contributes to the overall information delivery time differs for the three
missions. These three missions are the Missile Warning (MW), the Air Warning (AW) and
the Space Warning (SW) missions. The strings associated with each of these missions are
shown in figure 2. The following methodology addresses neither transit times for the sensors
nor communication propagation times. Message transmission times are treated as constants.
Future work will address the variations in transmission times for the various links.

Figure 1. Transit Time

Figure 2. The Strings


1.4  APPROACH 


Since each of the CMU subsystem specifications was developed independently, no
statistical consistency in the transit time specifications exists. Furthermore, the requirements
for missile, air and space warning information delivery time have not been specified. The
approach recommended in this paper for specifying the overall information delivery time uses
rectangular distributions to represent the subsystem specifications. Convolution of the
rectangular distributions approximating the subsystem transit times is the preferred process
for deriving the overall information delivery times. Since the convolution of a large number
of distributions is difficult, the Central Limit Theorem is used to approximate the convolution
of these rectangular distributions. The resultant distribution, derived from the convolution
product, represents the overall information delivery time distribution.

The following sections address the overall information delivery time derivation
methodology. An overview of the subsystem transit time specifications can be found in
section 2. Section 3 examines the previous analysis along with the original assumptions used
in that work. Section 4 explores the SCIS test data. Section 5 explores and explains
approximate solutions. Conclusions and recommendations follow in sections 6 and 7.




SECTION  2 


THE OVERALL INFORMATION DELIVERY TIME SPECIFICATION PROBLEM


2.1  THE  TRANSIT  TIME  SPECIFICATIONS 

In  each  of  the  subsystems,  the  transit  time  is  specified  differently.  CSSR  specified 
transit  time  at  the  99.8  percent  and  99.99  percent  levels.  The  SCIS  program  required  the 
messages  to  transit  the  subsystem  in  “a”  seconds  99  percent  of  the  time  with  a  maximum  of 
“b”  seconds.  A  total  queue  time  was  also  specified.  The  SPADOC  subsystem  specified  a 
mean  and  maximum  transit  time.  The  CCPDS-R  program  specified  the  98  percent  value. 
Granite Sentry specified a Space Warning transit time at 95 percent and an Air Warning
transit time at 95 percent. Since the specification values are classified, the real values of the
specifications have been replaced with variables as placeholders in figure 3.


2.2  LACK  OF  INFORMATION  IN  SUBSYSTEM  SPECIFICATION 

In  three  of  the  subsystems,  the  specification  of  the  transit  time  is  given  as  one 
percentile  only,  making  it  difficult  for  the  analyst  to  determine  the  transit  time  distributions. 
The  provision  of  one  point  is  not  adequate  to  determine  the  normal  curve  for  that  subsystem 
since  the  normal  curve  is  completely  determined  by  its  mean  and  standard  deviation.  The 
determination  of  these  two  parameters  by  solving  two  simultaneous  equations  needs  two 
percentile  points.  Without  a  second  point,  guesses  have  to  be  made  about  the  normal 
distribution  parameters.  These  guesses  may  significantly  affect  the  resultant  information 
delivery  time  distribution  as  shown  in  figures  4  and  5. 
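
The two-equation argument above can be sketched numerically. The following is a minimal illustration, assuming a normal transit time and using hypothetical specification values (the real values are classified); the function name normal_from_two_percentiles is an invention for this sketch. With two percentile points the mean and standard deviation are fixed; with one point, each guessed standard deviation implies a different mean.

```python
from statistics import NormalDist

def normal_from_two_percentiles(t1, p1, t2, p2):
    """Solve mu + z(p1)*sigma = t1 and mu + z(p2)*sigma = t2 for (mu, sigma)."""
    z1 = NormalDist().inv_cdf(p1)
    z2 = NormalDist().inv_cdf(p2)
    sigma = (t2 - t1) / (z2 - z1)
    mu = t1 - z1 * sigma
    return mu, sigma

# Hypothetical two-point specification: 2.0 s at 99.8 percent, 2.5 s at 99.99 percent.
mu, sigma = normal_from_two_percentiles(2.0, 0.998, 2.5, 0.9999)
print(f"two points: mean={mu:.3f} s, std={sigma:.3f} s")

# With only the 99.8 percent point, every guessed sigma yields a different mean.
z = NormalDist().inv_cdf(0.998)
for guessed_sigma in (0.1, 0.3, 0.5):
    print(f"one point: sigma={guessed_sigma} -> mean={2.0 - z * guessed_sigma:.3f} s")
```

The loop shows why a single percentile leaves a whole family of normal curves open to the analyst.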

The guesses can have a significant and misleading impact, especially for distributions
that are clustered around the positive side of zero. In the past, when the assumption of a
normal distribution was employed, negative transit times were sometimes found for
reasonable percentiles. There was no computational error in these calculations. The error was
in the assumptions. These negative transit time values arose when both a mean and a variance
were assumed for the normal distribution. These spurious findings led to further
investigations of the transit time methodology.


2.3  SPECIFICATION  OF  OVERALL  INFORMATION  DELIVERY  TIME 

The  problem  is  to  derive  the  overall  information  delivery  time  specification  from  these 
program  specifications.  A  graphic  depiction  of  the  meaning  of  the  overall  information 
delivery time is given in figure 6. The difficulty associated with this task is that there is very
little  information  for  the  analyst  to  use  in  performing  this  derivation. 

In  the  next  section  the  original  approach  is  discussed,  along  with  its  basic 
assumptions  and  its  methodology. 




M.  H.  Weeden 


Figure 3. The Specification Values
Figure 4. Effect of Error in Estimating Mean
Figure 5. Effect of Error in Estimating Variance
Figure 6. Overall Information Delivery Time for Missile Warning


SECTION  3 
PREVIOUS  STUDIES 


3.1  THE  ORIGINAL  APPROACH 

The  current  approach  to  determining  the  overall  CMU  information  delivery  time 
specification  for  the  APB  is  to  1)  assume  that  the  transit  time  distribution  for  each  subsystem 
is Gaussian or normal, 2) find the mean and variance for each of these distributions from the
specification  values,  3)  normalize  each  of  the  subsystem  transit  times  to  99.8  percent,  and 
then  4)  add  these  99.8  percent  values  to  get  the  overall  information  delivery  time  for  a 
particular  string. 


3.2  ORIGINAL  APPROACH  ASSUMPTIONS 

One  of  the  disadvantages  of  the  current  approach  is  that  the  normal  distribution  has  a 
negative  tail.  The  existence  of  a  negative  portion  of  the  distribution  means  that  it  is  possible 
for  messages  to  take  negative  time  in  transiting  a  subsystem.  In  addition,  a  number  of  the 
subsystems have small, specified transit times. Using the current approach, negative values
for  reasonable  percentiles  during  normalization  have  been  found.  These  findings  are  a  clear 
indication  that  the  assumption  of  the  normal  curve  may  be  incorrect,  at  least  for  some  of  the 
subsystems. 
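
A small numerical sketch shows how a negative transit time can arise from the normal assumption without any computational error. All numbers here are hypothetical: a subsystem specified at 1.5 seconds for 99 percent of messages, with an analyst-guessed mean of 0.5 seconds.

```python
from statistics import NormalDist

# Hypothetical specification point and guessed mean (not values from the report).
spec_time, spec_pct = 1.5, 0.99
guessed_mean = 0.5

# Standard deviation implied by forcing a normal curve through the spec point.
sigma = (spec_time - guessed_mean) / NormalDist().inv_cdf(spec_pct)
fitted = NormalDist(guessed_mean, sigma)

p10 = fitted.inv_cdf(0.10)       # 10th percentile of the fitted normal
prob_negative = fitted.cdf(0.0)  # probability mass below zero seconds

print(f"implied std: {sigma:.3f} s")
print(f"10th percentile: {p10:.3f} s")
print(f"P(transit time < 0): {prob_negative:.3f}")
```

Under these assumptions the 10th percentile falls below zero, so normalizing to a reasonable lower percentile reports a negative time.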

A second shortcoming with the current approach is the addition of the 99.8 percent
values of each of the individual message transit times for the subsystems in the particular
string under consideration to get an overall 99.8 percent information delivery time.
Mathematical theorems exist that say 1) the means of any distributions can be added to get an
overall mean, and 2) if the subsystems are independent, the variances can be added to yield a
variance for the overall distribution. Apart from these two quantities, the distribution of the
sum of random variables x and y should be found using convolution or an approximation to
the convolution.


3.3  METHOD  FOR  ADDING  RANDOM  VARIABLES 

The method for adding random variables is specified in Probability, Random
Variables, and Stochastic Processes by Papoulis [1]. The Fundamental Theorem states: “If
the random variables x and y are independent, then the density of their sum z = x + y equals
the convolution of their respective densities.” [1] By densities, Papoulis means distributions,
or basically histograms. (A histogram is a graph with, in this case, transit times on the
independent axis and counts of the occurrences of these times along the dependent axis.) The
Fundamental Theorem means that the analyst cannot add transit times at a specific percentile
from different distributions and report the sum as either a worst case or the corresponding
percentile.


3.4 CLARIFYING EXAMPLE


The  analyst  needs  to  convolve  the  probability  density  functions  for  the  subsystems 
because  of  the  nature  of  the  statistical  problem.  To  clarify,  consider  a  simple  problem. 
Suppose  that  there  are  two  random  variables,  x  and  y.  Suppose  that  their  sum  is  z,  i.e. 
z=x+y.  Also  suppose  that  z  equals  10  and  consider  only  the  integers  between  1  and  10.  Then 
the  sum,  10,  can  be  acquired  by  the  sum  of  either  1  and  9,  or  2  and  8,  or  3  and  7,  and  so 
on.  This  is  the  same  as  either  1  and  (10-1),  or  2  and  (10-2),  or  3  and  (10-3),  etc.  Thus,  the 
sum  10,  can  be  arrived  at  through  the  sum  of  a  number  of  different  values.  Since  we  are 
looking  for  the  probability  of  the  occurrence  of  10,  the  sum  of  the  product  of  the  probabilities 
of the various summands is the probability of the occurrence of 10. The above algorithm is a
simple example of the discrete convolution product. There are two exceptions to the need for
convolution when adding random variables. It can be shown that the means can be added to
obtain the mean of the overall distribution [1]. In the case of independence, variances can also
be added to get the variance of the overall distribution [1].
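
The sum-of-products rule in this example can be written out directly. This sketch assumes, purely for illustration, that x and y are independent and each equally likely to be any integer from 1 to 9; the source does not state the probabilities, and the name prob_of_sum is hypothetical.

```python
from fractions import Fraction

def prob_of_sum(px, py, z):
    """P(x + y = z) for independent x and y: sum over x of P(x) * P(y = z - x)."""
    return sum(p * py.get(z - x, 0) for x, p in px.items())

# Assumption for illustration: x and y are uniform on the integers 1..9.
px = {i: Fraction(1, 9) for i in range(1, 10)}
py = dict(px)

# The pairs (1, 9), (2, 8), ..., (9, 1) each contribute 1/9 * 1/9.
p10 = prob_of_sum(px, py, 10)
print(p10)
```

Each term in the sum corresponds to one of the pairs listed in the text (1 and 9, 2 and 8, and so on).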


3.5  COUNTEREXAMPLE 

In order to further examine this counterintuitive concept, consider the
counterexample depicted in figure 7. In that example, adding the 90 percent value of
the first distribution to the 90 percent value of the second distribution gives the 81 percent
value of the overall distribution. In the figure, the system consists of two simple subsystems,
A and B. In subsystem A, messages transit the subsystem in three discrete times: 1,
2, and 3 seconds. A count of the number of messages that transited subsystem A in 1 second
reveals that they occur with a probability of .6. (This means that if 100 messages were sent
through subsystem A, it is likely that 60 would get through in 1 second.) In subsystem
B, 90 percent of the messages get through in 1 second, none get through in exactly 2 seconds,
and 100 percent get through within 3 seconds. Convolution of the two distributions produces
the overall distribution on the right of the figure.

The values for the convolution are found at the bottom of the figure. There is only one
way to get an overall transit time of 2 seconds: transit the first subsystem in 1 second and
the second subsystem in 1 second. The probabilities, .6 and .9, are multiplied to get .54.
For a transit time of 3 seconds, either the message transited the first subsystem in 1 second
and the second subsystem in 2 seconds, or the first subsystem in 2 seconds and the second in
1 second. The probabilities for these are displayed and add to .27. This technique is
continued for all possibilities.

Observe  that  even  though  90  percent  of  the  messages  transit  the  first  subsystem  in  2 
seconds  and  the  second  in  1  second,  only  81  percent  of  the  messages  transit  the  overall 
system  in  3  seconds,  the  sum  of  1  and  2  seconds.  This  simple  example  shows  that  at  least  in 
some  cases,  convolution  produces  a  different  result  than  simply  adding  the  90  percent  values. 
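
The counterexample can be reproduced with a short discrete convolution. The probabilities for subsystem A at 2 and 3 seconds (.3 and .1) and for subsystem B at 3 seconds (.1) are not quoted directly in the text; they are inferred here from the statements that 90 percent of messages clear A within 2 seconds and that all messages clear each subsystem within 3 seconds. The helper names are hypothetical.

```python
from fractions import Fraction

def convolve(pa, pb):
    """Discrete convolution of two transit-time distributions {seconds: probability}."""
    out = {}
    for ta, qa in pa.items():
        for tb, qb in pb.items():
            out[ta + tb] = out.get(ta + tb, 0) + qa * qb
    return dict(sorted(out.items()))

def percentile(pmf, p):
    """Smallest time whose cumulative probability reaches p."""
    total = 0
    for t, q in sorted(pmf.items()):
        total += q
        if total >= p:
            return t
    raise ValueError("p exceeds the total probability")

A = {1: Fraction(60, 100), 2: Fraction(30, 100), 3: Fraction(10, 100)}
B = {1: Fraction(90, 100), 2: Fraction(0), 3: Fraction(10, 100)}
overall = convolve(A, B)

p90 = Fraction(9, 10)
naive = percentile(A, p90) + percentile(B, p90)                  # 2 s + 1 s
cdf_at_naive = sum(q for t, q in overall.items() if t <= naive)  # share done by then
true_p90 = percentile(overall, p90)

print({t: float(q) for t, q in overall.items()})
print(f"sum of 90 percent values: {naive} s, reached by {float(cdf_at_naive):.0%}")
print(f"actual 90 percent value of the overall system: {true_p90} s")
```

Under these inferred values, the sum of the two 90 percent times covers only 81 percent of messages, and the overall 90 percent time is one second longer.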


3.6 ORIGINAL METHODOLOGICAL SUMMARY


Thus, the original methodology assumed that the transit time distributions were
normal. Then, when necessary, means and variances for these distributions were assumed.
The 99.8 percent de facto standard found by normalization techniques was calculated, and
then these 99.8 percent values for each of the subsystems were added in the order in which
messages traverse the system. This sum became the overall information delivery time. In the
next sections, the updated methodology is presented.




SECTION  4 


INVESTIGATION  OF  TEST  DATA  DISTRIBUTIONS 


4.1  SAMPLING  TECHNIQUE 

Early in the methodology update analysis, samples of the SCIS transit time data were
taken. The intent was to determine the transit time distribution(s) from SCIS test data. Five
histograms of 100 message transit times were developed and compared to determine if there
was significant variance between samples. The observed variation in samples was within
reasonable limits.


4.2  HISTOGRAM  DEVELOPMENT 

A composite histogram of the 500 data points was compiled from the individual
histograms and the mean and variance of the test data found. The resultant histogram, with
some trimodal characteristics evident, is shown in figure 8. For the sample size of 500 test
values, the corresponding Gaussian distribution is shown overlaid on the test data. (The
CSSR project also provided the analyst with the mean and variance of CSSR test data for
message transit times. These were used as a “zeroth order filter” on a candidate set of
distributions as is explained below. Preliminary calculations indicate that the CSSR message
transit times may be Erlang. A comparison with the possible CSSR Erlang distribution is
shown in figure 9.)


4.3  CANDIDATE  DISTRIBUTIONS 

A  list  of  possible  distributions  that  might  fit  the  SCIS  test  data  histogram  was 
postulated.  These  candidate  distributions  were  1)  the  Erlang,  2)  the  Exponential,  3)  the 
displaced  Exponential,  4)  the  Normal  or  Gaussian,  5)  the  Maxwell,  6)  the  Rayleigh,  and  7) 
the Gamma function. These distributions are depicted in figures 10 through 16.

The Erlang was developed to model telephone situations and was deemed to be a
good candidate. In the Erlang distribution, messages arrive at a service or processing facility
where k distinct phases of the service must be performed on each message. If the time to
perform each phase has an exponential distribution and is independent of the time to perform
each of the other phases, then the total time to perform all k phases of service has the Erlang
distribution.

The  normal  distribution  was  selected  as  a  possibility  because  1)  it  was  currently  being 
used  and  2)  if  the  mean  were  far  enough  to  the  right,  it  might  not  be  a  bad  choice.  It  is  well 
known  that  in  situations  where  a  lot  of  unknown  variables  are  working  on  a  parameter,  that 
the  Gaussian  is  often  a  good  choice. 
Figure 9. CSSR Test Data - Preliminary Results
Figure 10. The Erlang Distribution
Figure 15. The Rayleigh Distribution
Figure 16. The Gamma


The  exponential  and  displaced  exponential  were  chosen  because  these  are  the  standard 
distributions  used  in  queueing  theory  applications  for  the  service  element  in  a  network.  The 
Gamma function was selected by statisticians because of its generality; the Maxwell and
Rayleigh  distributions  were  added  for  completeness. 


4.4  STATISTICAL  CALCULATIONS 

A  "zeroth  order  filter",  better  known  as  engineering  judgement,  was  applied  to  the 
candidate  distributions.  The  "zeroth  order"  approach  involved  calculating  the  defining 
parameters  of  each  of  the  candidate  distributions  by  using  the  mean  and  variance  of  the  test 
data  histogram.  If  the  defining  parameters  were  not  judged  reasonable,  i.e.  if  they  were  too 
large  or  too  small,  the  distribution  was  discarded.  Before  using  the  filter,  the  500  SCIS 
transit  times  were  added  and  the  average  found.  The  variance  for  these  data  was  also  found. 
These  two  statistics  were  used  to  filter  the  set  of  candidate  distributions  in  the  following 
manner. 

The calculations for the gamma distribution are shown in figure 17. It can be shown
that the gamma distribution has a mean and variance, E(T) and var(T), equal to a*b and
a*b**2, respectively. The parameters, a and b, are the defining parameters for the gamma
function. In the study, the mean found from the data was set equal to a*b and the variance
found from the data was set equal to a*b**2. With the values found from the data, the
defining parameters, a and b, were calculated to be 50 and .029 respectively. Just as one
would not believe that a polynomial of degree 50 is a good fit to the data, these calculated
values for the defining parameters indicate that the gamma function probably does not
represent the data.

The next candidate distribution to be considered was the Erlang distribution. The
calculations for this work are shown in figure 18. The mean and variance for the data were
again equated to the expressions in the defining parameters of this distribution. The mean,
E(T), was set equal to 1/u and the Var(T) to 1/(k*u**2). The value for u found was .83 and
the value for k, 50.05. These values also indicate that the Erlang distribution was not a viable
distribution for the SCIS test data.
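
The moment-matching step used for the gamma and Erlang candidates can be sketched as follows. The relations are the ones stated in the text (mean = a*b and variance = a*b**2 for the gamma; mean = 1/u and variance = 1/(k*u**2) for the Erlang). The input mean and variance below are illustrative stand-ins chosen for round results, not the SCIS test statistics, and the function names are hypothetical.

```python
def gamma_params(mean, var):
    """Solve mean = a*b and var = a*b**2 for the gamma parameters (a, b)."""
    b = var / mean
    a = mean / b
    return a, b

def erlang_params(mean, var):
    """Solve mean = 1/u and var = 1/(k*u**2) for the Erlang parameters (u, k)."""
    u = 1.0 / mean
    k = 1.0 / (var * u ** 2)
    return u, k

# Illustrative sample statistics only (seconds and seconds squared).
sample_mean, sample_var = 1.2, 0.0288

a, b = gamma_params(sample_mean, sample_var)
u, k = erlang_params(sample_mean, sample_var)
print(f"gamma:  a={a:.2f}, b={b:.4f}")
print(f"Erlang: u={u:.3f}, k={k:.2f}")
```

Both a and k reduce to mean**2/variance, so a tightly clustered histogram drives either parameter to a large value, which is what the zeroth order filter rejects.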

Calculations for the displaced exponential distribution are shown in figure 19. The
approach used in this analysis is slightly different from that taken above. In this approach,
three properties of probability density functions are employed. These are 1) that the total area
under the curve is equal to 1, 2) that the mean is equal to the integral of t*f(t), and 3) that the
variance is the integral of t**2 * f(t) minus the mean squared. Using these well known
properties, the values of the defining parameters were found. The parameter, a, the point at
which the density function starts, was found to be 1.025. The value of lambda was 5.7 and
the value of k, 1^3. The parameter, k, seemed too large, and this distribution was placed in
a “maybe” category.

Figure 17. Sample Calculations - Gamma
Figure 19. Sample Calculations - Displaced Exponential
Figure 20.

In order to determine if the distribution was normal, a range of means and variances
was chosen. Since the 99 percent value was known from the specification, the normalization
equation, given in figure 20, was set equal to 2.33, the 99 percent value of the normalized
curve. If the data were normal, the normalized value derived from the 99 percent specification
value, the observed mean and the standard deviation should approximate 2.33. The values
found are shown in figure 20. Letters have been substituted for the actual values of the means
and variances. One reasonable value, calculated from the normalization equation, was found
just outside of the postulated acceptable range of means and variances. The normal curve was
placed in a “maybe” category. Further investigation using the Chi Square statistic eliminated
this candidate distribution.
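
The normalization check can be sketched as below. The value 2.33 is the standard normal quantile at 0.99; the specification value and the candidate means and standard deviations are hypothetical, since the report substitutes letters for the actual values, and the tolerance of 0.1 is an arbitrary choice for this sketch.

```python
from statistics import NormalDist

Z99 = NormalDist().inv_cdf(0.99)  # standard normal value at 99 percent, about 2.33

def normalized_value(x, mean, std):
    """Normalization equation: distance of x from the mean in standard deviations."""
    return (x - mean) / std

spec_99 = 2.0  # hypothetical 99 percent transit time, seconds
candidates = [(1.0, 0.30), (1.2, 0.34), (1.4, 0.40)]  # hypothetical (mean, std) pairs

for mean, std in candidates:
    z = normalized_value(spec_99, mean, std)
    verdict = "consistent" if abs(z - Z99) < 0.1 else "inconsistent"
    print(f"mean={mean}, std={std}: z={z:.2f} ({verdict} with a normal curve)")
```

Only candidate pairs whose normalized value lands near 2.33 are compatible with both the specification point and a normal curve.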

In  a  similar  manner,  both  the  Rayleigh  and  Maxwell  distributions  were  eliminated  as 
shown in figures 21 and 22. In these calculations the mode of each distribution is assumed to
occur  at  the  SCIS  test  data  mean  transit  time.  The  question  was  then  asked:  Does  the  mode 
of  124/500,  the  SCIS  test  data  peak,  equal  or  nearly  equal  the  peak  value  for  the  proposed 
distributions?  The  answer  in  both  cases  is  no.  These  two  distributions  were  eliminated  as 
candidates. 


4.5  RESULTS  OF  JUDGMENTAL  FILTER 

The  zeroth  order  calculations  indicated  that  no  distribution  passed  the  filter.  The 
results  from  applying  the  judgmental  filter  are  summarized  in  figure  23.  Since  no  definitive 
density  function  was  found,  a  new  approach  based  on  approximations  using  the  specification 
values  was  pursued.  This  approach  proved  fruitful  and  is  described  in  the  following  sections. 




Figure  21.  Sample  Calculations  -  Rayleigh 
Figure 23. Zeroth Order Results - Summary


SECTION  5 


THE  SEARCH  FOR  APPROXIMATE  SOLUTIONS 


5.1  THE  TRIANGULAR/TRAPEZOID  APPROXIMATION 

Since no suitable distribution had been found for the SCIS test data, a search for
approximations began. The piecewise linear approximation was developed. In this approach,
specification values are used to determine a triangular distribution, or a set of triangular
distributions, that “encloses” the test data. Instead of trying to find a curve that fit the data, a
bound or enclosure of the test data was sought. Then, the test data can vary within the
enclosure and have any distribution without affecting the approximate specification
distribution.

In the triangular distribution, the area of the triangle (or multiple triangles, or one
triangle and one trapezoid) was set equal to the specified percentage. Thus, if a transit time
were specified as x seconds in 99 percent of the time, then the area of the triangle was .99
and the end point or vertex of the triangle was the value of the transit time at that
specification, x. Using the information about the area and the endpoints, the equation of the
lines of each of the triangles was found. From this equation of the line, any percentage value
could be found. A sample triangular distribution is given in figure 24.

Even though this approximation technique did not produce a suitable distribution, the
ambiguities or lack of information in the subsystem specifications were highlighted. In a
number of the subsystems, the length of the tail was unknown. It was impossible to
determine the location of the mode or the mean for most of the specifications. This was the
same problem encountered in the earlier analysis. The location of the initial or smallest transit
time was difficult to determine. The number of modes in the distribution was also unknown.
The triangular distribution investigation pointed out how little we know about the distribution.
The investigation of the triangular approximation was abandoned and the exploration of the
uniform or rectangular distributions began.


5.2  THE  RECTANGULAR/UNIFORM  DISTRIBUTION  APPROXIMATION 

Theory indicates that when little is known about the distribution, a uniform
distribution (see figure 25) is preferred. In Introduction to Simulation and SLAM, Pritsker
and Pegden [2] say, "The use of the uniform distribution often implies a complete lack of
knowledge concerning the random variable other than it is between a minimum and a
maximum value." The uniform distribution has a constant value between a beginning value
and an end value. Because the probability density function is constant over the
interval, each transit time in the interval is equally likely to occur. Moreover, the
area under the uniform distribution is one. Since the specifications do not always provide a
maximum value for the distribution, the rectangular distribution is used in place of the
uniform distribution for some subsystems. The differences between the uniform and the
rectangular distribution are 1) the area under a rectangular distribution does not need to be
one, and 2) the rectangular distribution can be the concatenation of a number of rectangular
distributions as shown in figure 26.

The choice of the rectangular or uniform distribution for the transit time distribution
has a number of advantages. Since each of the transit times is equally likely, no bias is
introduced by the analyst by making assumptions about the mean or the variance. The
uniform (and rectangular) distributions have a nonnegative minimum value so that there is
zero probability of negative transit times.


5.3  DETERMINATION  OF  OVERALL  INFORMATION  DELIVERY  TIME 

In order to determine the overall information delivery time, the new methodology
proposes that the best approximation to the overall delivery time will be found by the
convolution of the uniform or rectangular distributions to acquire an overall distribution. This
approach is consistent with the Fundamental Theorem on page 189 of Probability, Random
Variables, and Stochastic Processes by Papoulis [1]. This theorem states: “If the random
variables x and y are independent, then the density of their sum z = x + y equals the
convolution of their respective densities.”

By  densities,  Papoulis  means  distributions,  or  basically  histograms.  (A  histogram  is  a 
graph  with,  in  this  case,  transit  times  on  the  independent  axis  and  counts  of  the  occurrences 
of these times along the dependent axis.) The Fundamental Theorem means that the analyst
can  not  add  transit  times  at  a  specific  percentile  from  different  distributions  and  report  the 
sum  as  either  a  worst  case  or  the  corresponding  percentile.  There  are  two  exceptions.  It  can 
be  shown  that  the  means  and,  in  the  case  of  independence,  variances  can  be  added  to  obtain 
the  mean  and  variance  of  the  overall  distribution. 

With  these  theorems  in  mind,  the  overall  information  delivery  time  density  function 
can  be  found  by  convolving  the  transit  time  distributions  for  each  subsystem  as  shown  in 
figure  27.  The  number  of  convolutions  and  the  distributions  involved  in  those  convolutions 
are  dependent  upon  what  subsystems  are  in  each  string.  In  the  case  of  the  missile  warning 
string,  there  are  eight  traversals  of  the  various  subsystems;  for  air  and  space,  there  are  six. 
This  means  that  the  resultant  distribution  is  the  convolution  of  either  six  or  eight 
distributions.  Since  convolution  is  difficult,  especially  for  a  convolution  of  six  or  eight 
distributions,  an  approximation  to  the  overall  distribution  using  the  Central  Limit  Theorem 
can  be  applied,  if  an  assumption  of  independence  of  each  subsystem  is  made,  and  if  the 
variances  of  each  of  the  distributions  are  similar. 

The Central Limit Theorem says:

Let x1, x2, x3, ..., xn be independent random variables. Then whatever the
form of their distribution - subject to certain very general conditions - the sum of
xn approaches a normal distribution as n increases without bound. This normal
distribution has mean equal to the sum of the means and variance equal to the
sum of the variances of the n random variables. (Mathematics Dictionary,
James and James, 1968, p.44) [3]
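
A minimal sketch of the Central Limit Theorem approximation follows, assuming each subsystem transit time is uniform between a minimum and a maximum and that the subsystems are independent. It relies on the standard results that a uniform distribution on [lo, hi] has mean (lo + hi)/2 and variance (hi - lo)**2/12. The six-subsystem string and its bounds are hypothetical, as are the function names.

```python
from math import sqrt
from statistics import NormalDist

def uniform_mean_var(lo, hi):
    """Mean and variance of a uniform distribution on [lo, hi]."""
    return (lo + hi) / 2, (hi - lo) ** 2 / 12

def clt_percentile(bounds, pct):
    """Percentile of the summed transit time under the normal approximation."""
    means, variances = zip(*(uniform_mean_var(lo, hi) for lo, hi in bounds))
    return NormalDist(sum(means), sqrt(sum(variances))).inv_cdf(pct)

# Hypothetical string: six subsystems, each uniform between 0 and 1 second.
string_bounds = [(0.0, 1.0)] * 6
pct = 0.998

clt_value = clt_percentile(string_bounds, pct)
# For comparison only: the sum of each subsystem's own 99.8 percent value.
naive_value = sum(lo + pct * (hi - lo) for lo, hi in string_bounds)

print(f"CLT approximation at {pct:.1%}: {clt_value:.3f} s")
print(f"sum of individual {pct:.1%} values: {naive_value:.3f} s")
```

With equal variances, as assumed here, the approximation gives a noticeably smaller figure than adding the individual 99.8 percent values; the normal approximation is least reliable far out in the tails.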

Figure 27. Overall Information Delivery Time Density Function


Papoulis shows that convolution of a number of distributions, including uniform
distributions, quickly produces a curve that approximates the normal curve. (Significant
differences may occur in the tails in some cases, especially if one of the curves in the
convolution has a large variance relative to the other variances. If this situation occurs, as it
did in our analysis, then the transit time distribution with the large variance should be
convolved with the Central Limit Theorem approximation to the convolution of the
remaining distributions.) Papoulis performs a sample convolution of the uniform distribution
with itself on pages 267 and 268 of Probability, Random Variables, and Stochastic
Processes. [1] This convolution is shown in figure 28. The first convolution produces a
triangular distribution. Of the result of convolving the uniform with the triangular
distribution, Papoulis says, “...after just two convolutions, the resultant curve is remarkably
close to the normal curve.”
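
The behavior Papoulis describes can be reproduced numerically. The following is a minimal sketch, not the report's own calculation: it assumes a unit uniform distribution sampled on a grid of 200 cells (an arbitrary resolution) and uses a plain Riemann-sum convolution. The helper names `convolve` and `normal_pdf` are illustrative.

```python
import math

def convolve(f, g, h):
    """Riemann-sum approximation to the convolution of two sampled densities."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj * h
    return out

def normal_pdf(x, mean, var):
    """Normal density with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

N = 200              # grid cells across the unit interval (assumed resolution)
h = 1.0 / N
uniform = [1.0] * N  # U(0, 1) density sampled at cell midpoints (i + 0.5) * h

triangular = convolve(uniform, uniform, h)     # sample k sits at x = (k + 1) * h
three_fold = convolve(triangular, uniform, h)  # sample k sits at x = (k + 1.5) * h

# Normal curve with the summed mean (3 * 1/2) and summed variance (3 * 1/12).
mean, var = 1.5, 0.25
max_gap = max(
    abs(value - normal_pdf((k + 1.5) * h, mean, var))
    for k, value in enumerate(three_fold)
)
print(f"peak of triangular density: {max(triangular):.3f}")
print(f"largest gap between three-fold density and normal curve: {max_gap:.3f}")
```

The largest gap is a few hundredths in density units and occurs at the peak, which is consistent with the remark that two convolutions already give a curve close to normal.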


To use the Central Limit Theorem, the mean and variance of each subsystem's
uniform distribution must be found. This is done by using the transit time provided in the
system specification. If, for example, the maximum transit time for a hypothetical subsystem
were 100 seconds and the minimum 0 seconds, then the probability density function for this
distribution would be the constant 1/100 over that interval. This constant function,
multiplied by x, is then integrated from zero to the end point of the uniform distribution to
determine the distribution's mean. To find the variance, the constant probability density
function is multiplied by x**2 and integrated over the same interval. The mean squared is
subtracted from the result. These are standard methods for finding the mean and variance of a
distribution and are depicted in figure 29.
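
For the hypothetical 0 to 100 second subsystem described above, the two integrals can be worked through explicitly. The symbols f, \mu and \sigma^2 are notation introduced here for the density, mean and variance; they are not taken from figure 29.

```latex
\begin{align*}
f(x) &= \frac{1}{100}, \qquad 0 \le x \le 100 \\
\mu &= \int_0^{100} x \cdot \frac{1}{100}\,dx = \frac{100^2}{2 \cdot 100} = 50 \\
\sigma^2 &= \int_0^{100} x^2 \cdot \frac{1}{100}\,dx - \mu^2
          = \frac{100^3}{3 \cdot 100} - 50^2 \approx 3333.33 - 2500 = 833.33 \\
\sigma &= \sqrt{833.33} \approx 28.87
\end{align*}
```

More generally, a uniform distribution between a minimum a and a maximum b has mean (a + b)/2 and variance (b - a)**2 / 12, which reproduces the values above for a = 0 and b = 100.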

After the mean and variance of each subsystem's uniform, or rectangular,
distribution are determined, they are summed over the missile, air, and space strings to
obtain the mean and variance of the overall message delivery time for each string. It is
valid to add the means of the distributions (Papoulis, Probability, Random Variables, and
Stochastic Processes, p. 143 [1]). It is valid to add the variances of the distributions if the
subsystems are independent (Papoulis, Probability, Random Variables, and Stochastic
Processes, p. 211 [1]). Since the convolution product is nearly normal and since the normal
curve is completely determined by its mean and variance, the 99.8 percent value for the
overall message delivery time can be found by adding 2.88 times the standard deviation to
the mean. The value 2.88 is the tabled 99.8 percent value for the normalized Gaussian. Using
this methodology, any other percentile value can also be found for the overall message
delivery times.
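
The procedure in the preceding paragraph can be sketched in a few lines. The subsystem bounds below are hypothetical (six traversals, each specified as 0 to 2 seconds) and are not values from the CMU specifications; `uniform_mean_var` and `clt_percentile` are illustrative names. The standard library's `NormalDist().inv_cdf` supplies the tabled multiplier.

```python
import math
from statistics import NormalDist

def uniform_mean_var(lo, hi):
    """Mean and variance of a uniform distribution on [lo, hi]."""
    return (lo + hi) / 2.0, (hi - lo) ** 2 / 12.0

def clt_percentile(bounds, p=0.998):
    """Percentile p of the summed transit time under the normal approximation."""
    moments = [uniform_mean_var(lo, hi) for lo, hi in bounds]
    mean = sum(m for m, _ in moments)
    var = sum(v for _, v in moments)  # valid only for independent subsystems
    z = NormalDist().inv_cdf(p)       # about 2.88 for p = 0.998
    return mean + z * math.sqrt(var)

# Hypothetical string: six traversals, each specified as 0 to 2 seconds.
string_bounds = [(0.0, 2.0)] * 6
z_998 = NormalDist().inv_cdf(0.998)
delivery_998 = clt_percentile(string_bounds)
print(f"z = {z_998:.3f}, 99.8 percent delivery time = {delivery_998:.2f} s")
```

Changing `p` gives any other percentile from the same summed mean and variance. For this hypothetical string the 99.8 percent value stays below the 12 second sum of the maxima.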

5.4  THE  EFFECT  OF  LARGE  VARIANCE 

In the case of the space warning system, a large variance in one uniform distribution
caused an inaccuracy in the tails of the overall message delivery normal curve when the
Central Limit Theorem approximation to the overall information delivery time was used.
When the message delivery time was found by convolving the SPADOC subsystem's uniform
distribution with the normal approximation obtained from the other subsystems'
distributions using the Central Limit Theorem approach, a value smaller than the current
estimate resulted. This convolution product is more accurate and is the preferred answer in
this case.
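
The effect can be illustrated with a sketch that convolves one wide uniform distribution with a normal curve standing in for the remaining subsystems. All numbers are hypothetical (a 0 to 60 second uniform, and a normal with mean 6 and variance 2); they are not SPADOC values. The closed-form CDF used here follows from integrating the normal CDF over the uniform interval, and the function names are illustrative.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal

def G(x):
    """Antiderivative of the standard normal CDF: x * Phi(x) + phi(x)."""
    return x * STD.cdf(x) + STD.pdf(x)

def uniform_plus_normal_cdf(t, lo, hi, mu, sigma):
    """CDF at t of U + Z for independent U ~ uniform[lo, hi], Z ~ normal(mu, sigma)."""
    upper = (t - lo - mu) / sigma
    lower = (t - hi - mu) / sigma
    return sigma * (G(upper) - G(lower)) / (hi - lo)

def uniform_plus_normal_percentile(p, lo, hi, mu, sigma):
    """Invert the CDF by bisection on a bracket that contains nearly all the mass."""
    a, b = lo + mu - 10.0 * sigma, hi + mu + 10.0 * sigma
    for _ in range(100):
        mid = (a + b) / 2.0
        if uniform_plus_normal_cdf(mid, lo, hi, mu, sigma) < p:
            a = mid
        else:
            b = mid
    return (a + b) / 2.0

# Hypothetical inputs: one wide uniform plus a normal for the other subsystems.
wide_lo, wide_hi = 0.0, 60.0
rest_mean, rest_var = 6.0, 2.0

wide_mean = (wide_lo + wide_hi) / 2.0
wide_var = (wide_hi - wide_lo) ** 2 / 12.0
z = STD.inv_cdf(0.998)

clt_only = wide_mean + rest_mean + z * math.sqrt(wide_var + rest_var)
convolved = uniform_plus_normal_percentile(
    0.998, wide_lo, wide_hi, rest_mean, math.sqrt(rest_var)
)
print(f"CLT only:  {clt_only:.2f} s")
print(f"convolved: {convolved:.2f} s")
```

With these inputs the convolved 99.8 percent value is well below the value from the Central Limit Theorem alone, because the wide uniform has no tail beyond its maximum while the normal curve fitted to it does.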
Figure 28. Sample Convolution

Figure 29. Standard Methods for Calculating Mean and Variance


5.5  THE  SEPARATION  OF  SPECIFICATION  DETERMINATION  FROM 
PERFORMANCE  ESTIMATION 

Determination of the specification values for the APB and of the quarterly
estimates should be separated. This separation is shown graphically in figure 30. The overall
message delivery time specification for the APB should be found by the convolution (or
Central Limit Theorem approach) of uniform distributions as described above. The
distributions for this work should be uniform, or rectangular, distributions drawn from the
subsystem specifications. To find the quarterly estimates, real test data, or the means and
variances of the real test data, should be used in conjunction with the Central Limit Theorem
approach. For those programs that cannot supply either real data or the means and variances
from real data, the specification uniform distribution can be substituted until the real data
become available.
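
The fallback rule for the quarterly estimates can be sketched as follows. The bounds and test samples are invented for illustration, and `subsystem_moments` and `quarterly_estimate` are hypothetical names, not part of any CMU tool. Where test data exist, the sample mean and sample variance are used; otherwise the specification's uniform distribution supplies them.

```python
import math
from statistics import NormalDist, fmean, variance

def subsystem_moments(spec_bounds, samples=None):
    """Mean and variance from test data when supplied, else from the spec uniform."""
    if samples is not None and len(samples) >= 2:
        return fmean(samples), variance(samples)  # sample mean, sample variance
    lo, hi = spec_bounds
    return (lo + hi) / 2.0, (hi - lo) ** 2 / 12.0

def quarterly_estimate(subsystems, p=0.998):
    """Percentile p of the string delivery time under the normal approximation."""
    moments = [subsystem_moments(bounds, data) for bounds, data in subsystems]
    mean = sum(m for m, _ in moments)
    var = sum(v for _, v in moments)  # assumes independent subsystems
    return mean + NormalDist().inv_cdf(p) * math.sqrt(var)

# Hypothetical two-subsystem string, times in seconds.
subsystems = [
    ((0.0, 2.0), [1.0, 1.2, 0.8, 1.1, 0.9]),  # test data supplied
    ((0.0, 2.0), None),                       # no data yet: fall back to the spec
]
estimate = quarterly_estimate(subsystems)
print(f"99.8 percent quarterly estimate: {estimate:.2f} s")
```

Passing `None` for every subsystem reduces this to the specification calculation, which keeps the two uses separate while sharing one procedure.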
Figure 30. Separation of Specification Determination from Performance Estimation


SECTION  6 
CONCLUSIONS 


6.1  COMPARISON  OF  THE  METHODOLOGIES 

The above methodology differs from the current methodology. A comparison of the
99.8 percent values shows 1) a -11 percent change in the missile warning case; 2) a
-9.1 percent change in the air warning case; and 3) a -2.3 percent change in the space
warning case. The new methodology produces a smaller overall message delivery time for
all three strings. The average change was -7.5 percent. See table 1.


6.2  STRENGTHS  AND  WEAKNESSES  OF  THE  UPDATED  METHODOLOGY 

The new methodology, which invokes the convolution of uniform, or rectangular,
distributions and, consequently, the Central Limit Theorem, has enhanced statistical rigor.
If the means and variances of the test data are supplied by the contractors, the quarterly
estimates are simple to compute.

Certain assumptions are associated with this methodology. These are 1) that the
subsystems are independent and 2) that no subsystem has a distribution with a long tail.
Long tails would arise in situations where there is “clogging” in the system. A lack of
independence between subsystems could arise if the transit time of a message through one
subsystem affected its transit time through the next subsystem. Thus, any dependence on the
history of what had transpired would negate the independence assumption.

The reader should note that the uniform, or rectangular, distribution approach is an
approximation to the overall information delivery time distribution. As more information
becomes available, improvements to the methodology should be considered.


6.3  RECOMMENDATIONS 

The  Central  Limit  Theorem  approach  to  approximating  the  overall  information 
delivery  time  was  presented  to  several  audiences.  It  was  generally  recognized  that  the  new 
approach  possessed  enhanced  statistical  rigor.  The  approach  was  recommended  as  a 
replacement  for  the  existing  methodology. 
SECTION 8
GLOSSARY 


APB       Acquisition Program Baseline
AW        Air Warning
CMU       Cheyenne Mountain Upgrade
CSSR      Communication System Segment Replacement
DCP       Decision Coordinating Paper
ITW/AA    Integrated Tactical Warning/Attack Assessment
MW        Missile Warning
PMD       Program Management Directive
SCIS      Survivable Communications Integration System
SORD      System Operational Requirement Document
SPADOC    Space Defense Operations Center
SW        Space Warning
TEMP      Test and Evaluation Master Plan